Abstract

Background: Recent findings have shown that up to 60% of pheochromocytomas (PCCs)
and paragangliomas (PGLs) are caused by germline or somatic mutations in one of the 11
hitherto known susceptibility genes: SDHA, SDHB, SDHC, SDHD, SDHAF2, VHL, HIF2A (EPAS1),
RET, NF1, TMEM127 and MAX. This list of genes is constantly growing and the 11 genes
together consist of 144 exons. A genetic screening test is therefore highly time consuming
and expensive. Hence, we introduce next-generation sequencing (NGS) as a time-efficient and
cost-effective alternative.

Methods: Tumour lesions from three patients with apparently sporadic PCC were subjected
to whole exome sequencing utilizing the Agilent SureSelect target enrichment system and
the Illumina HiSeq platform. Bioinformatics analysis was performed in-house using commercially
available software. Variants in PCC and PGL susceptibility genes were identified.

Results: We identified 16 unique genetic variants in PCC susceptibility loci in three
different PCCs, with less than 30 min of hands-on, in-house time. Two patients had one
unique variant each that was classified as probably or possibly pathogenic: NF1 Arg304Ter
and RET Tyr791Phe. The RET variant was verified by Sanger sequencing.

Conclusions: NGS can serve as a fast and cost-effective method in the clinical genetic
screening of PCC. The bioinformatics analysis may be performed without expert skills. We
identified process optimization, characterization of unknown variants and determination of
additive effects of multiple variants as key issues to be addressed by future studies.

Introduction

Pheochromocytomas (PCCs) and paragangliomas (PGLs)
are rare tumours arising from chromaffin cells in the adrenal
medulla and autonomic ganglia. A majority of these
tumours have a low proliferation rate and seldom metastasize.
The understanding of underlying molecular mechanisms
in the tumorigenesis of these diseases has increased
dramatically during the last decade (1). Up to 80% of all
PCC and PGL could have either germline or somatic
mutations (2, 3, 4) in one of the 11 hitherto known
susceptibility genes: SDHA, SDHB, SDHC, SDHD, SDHAF2,
VHL, HIF2A (EPAS1), RET, NF1, TMEM127 and MAX (5, 6, 7,
8, 9, 10, 11). While there has been a constant flow of
reported new susceptibility loci, the capacity of instruments
approved for diagnostic use has failed to keep up with the
increasing demand. These 11 genes constitute 144 exons
(~25 000 bases); consequently, a comprehensive PCC and
PGL genetic screening test can be time consuming and is not
regarded as cost effective (12). This has motivated the design
of numerous screening algorithms to guide the investigators
in the selection of appropriate patients and tests (12, 13).
Sparse use of clinical genetic screening in patients with PCC
and PGL, despite the introduction of such guidelines, has
mainly been justified by cost-benefit arguments.

Introduction of novel sequencing techniques 
(denoted next-generation sequencing or NGS) has 
dramatically reduced the cost for DNA sequencing (14). 
The term NGS includes principally different sequencing 
platforms that share a high output of sequenced bases 
relative to traditional methods. Recently, the focus of 
experiments using NGS has been shifted from the research 
settings to investigate the use of NGS as a platform in 
clinical scenarios (15, 16, 17, 18). 

The NGS process is highly complex with multiple
steps that may be divided into genomic enrichment
(of selected regions, of all exons as in exome sequencing,
or none as in whole genome sequencing), sequencing
(including library preparation), bioinformatics analysis and,
in the clinical setting, genetic consultation (19).

Given the well-characterized genotype-phenotype
correlations of PCC and PGL and the limitations imposed
by existing technologies, there is a strong argument for
investigating the potential use of NGS as a diagnostic test
in the clinical genetic screening of these tumours.

Materials and methods 

Patients 

Tumour tissues from three patients with PCC were selected
for whole exome sequencing. Patient characteristics are
summarized in Table 1. All three patients had a
secretory unilateral PCC and no apparent signs/symptoms/
history suggesting pathogenic germline variants in known
susceptibility genes. The local ethics committee approved
the study and written informed consent was obtained
from all patients.

Table 1 Clinical characteristics of sequenced patients

Exome capture and high-throughput sequencing 

All samples were macro-dissected to achieve neoplastic
cellularity of >80%. DNA was prepared from cryosections
using Genomic-tip 20/G (cat. no. 10223, Qiagen). Sequencing
libraries were prepared from 3 µg gDNA using
SureSelect target enrichment system for Illumina paired- 
end sequencing libraries v2.2, October 2010 (Agilent 
Technologies, Santa Clara, CA, USA), according to the 
manufacturer's instructions. Briefly, the DNA was frag- 
mented using the Covaris S2 system (Covaris, Woburn, 
MA, USA). The DNA fragments were end-repaired using T4 
DNA polymerase, Klenow DNA polymerase and T4 
polynucleotide kinase (PNK), followed by purification 
using AMPure XP beads (Beckman Coulter, Brea, CA, 
USA). An A-base was added to the blunt ends of the DNA
fragments using the Klenow DNA polymerase and the
sample was purified using AMPure XP beads. Adapters for
sequencing were ligated to the DNA fragments, followed 
by purification using AMPure XP beads. The adapter- 
ligated libraries were amplified for five PCR cycles, 
followed by a second purification using AMPure XP 
beads. The quality of the enriched libraries was evaluated 
using the 2100 Bioanalyzer and a DNA 1000 kit (Agilent). 
Exon capture was performed from 500 ng of each 
sequencing library using the SureSelect Human All Exon 
50 Mb kit (Agilent). Briefly, the fragments in the library 
were hybridized to capture probes, unhybridized material 
was washed away and the captured fragments were 
amplified for ten PCR cycles, followed by purification 
using AMPure XP beads. The quality of the enriched 
libraries was evaluated using the 2100 Bioanalyzer and a 
High-Sensitivity DNA-kit (Agilent). The adapter-ligated
fragments were quantified by qPCR using the KAPA SYBR 
FAST library quantification kit for Illumina Genome 
Analyzer (KAPA Biosystems, Woburn, MA, USA). A 6 pM 
solution of the sequencing libraries was subjected to 
cluster generation on the cBot instrument (Illumina, 
Inc., San Diego, CA, USA). Paired-end sequencing was 
performed for 100 cycles in one lane using a HiSeq2000 
instrument (Illumina, Inc.), according to the manufac- 
turer's protocols. Base calling was done on the same 
instrument by RTA 1.10.36 and the resulting bcl files were 
converted to Illumina qseq format with tools provided by 
OLB-1.9.0 (Illumina, Inc.). Fastq sequence files were 
generated using CASAVA 1.7.0 (Illumina, Inc.). Additional
statistics on sequence quality were compiled from the base
call files with an in-house script
(http://molmed.medsci.uu.se/SNP+SEQ+Technology+Platform/).

Bioinformatics 

Sequencing generated a minimum of 125 × 10^6 reads in all
three tumours with an average read length of 100 bases
(Table 2). Generated sequences were processed using
commercially available software: CLC Genomics Workbench
4.9 (CLC Bio, Aarhus, Denmark). Reads from paired-end
fragments were trimmed for low quality and duplicate
reads were removed (Fig. 1). Remaining sequences were mapped
to the human reference sequence GRCh37.p5. A single-nucleotide
variant (SNV) and insertion/deletion detection
algorithm was used with low- and high-stringency
settings: low stringency, coverage of >8 reads and a
variant allele frequency of >25%; and high stringency,
coverage of >30 reads and a variant allele frequency of
>35%. Generated results were filtered for non-synonymous
variants and/or variants with a probable splice site effect.
The list was annotated for all gene annotations and then
filtered for variants in one of the 11 currently known PCC
susceptibility genes. The remaining variants were annotated
for overlapping information in selected genetic databases:
the Single Nucleotide Polymorphism Database (dbSNP), the
Catalogue of Somatic Mutations in Cancer (COSMIC), the
Human Gene Mutation Database (HGMD) and the Leiden Open
Variation Database (LOVD). The impact of non-synonymous
amino acid substitutions was assessed in silico, using
PolyPhen-2 (20) and SIFT (21). Cross-references were
manually gathered when available. Analysis of structural
variants in data generated by exome sequencing was not 
adequately supported by the software and was excluded 
from this experiment. 
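
The two stringency settings and the subsequent filters can be expressed as a short sketch. This is not the CLC Genomics Workbench implementation; it only restates the thresholds given above (coverage of >8 or >30 reads, variant allele frequency of >25% or >35%) together with the effect and gene filters. The `Variant` record, the effect labels and the function names are hypothetical, introduced purely for illustration.

```python
from dataclasses import dataclass

# The 11 susceptibility genes named in the text (HIF2A is listed as EPAS1).
PCC_GENES = {
    "SDHA", "SDHB", "SDHC", "SDHD", "SDHAF2", "VHL",
    "EPAS1", "RET", "NF1", "TMEM127", "MAX",
}

# Thresholds from the text: (coverage, variant allele frequency).
# Both are strict lower bounds, i.e. ">8 reads" and ">25%".
STRINGENCY = {"low": (8, 0.25), "high": (30, 0.35)}

# Hypothetical effect labels; a real annotation tool will use its own vocabulary.
REPORTABLE_EFFECTS = {"non_synonymous", "splice_site"}


@dataclass(frozen=True)
class Variant:
    """Hypothetical minimal record for one called variant."""
    gene: str
    change: str
    coverage: int        # total reads covering the position
    variant_reads: int   # reads supporting the variant allele
    effect: str

    @property
    def allele_frequency(self) -> float:
        return self.variant_reads / self.coverage if self.coverage else 0.0


def passes(variant: Variant, stringency: str) -> bool:
    """Apply coverage, allele frequency, effect and gene filters in turn."""
    min_coverage, min_frequency = STRINGENCY[stringency]
    return (
        variant.coverage > min_coverage
        and variant.allele_frequency > min_frequency
        and variant.effect in REPORTABLE_EFFECTS
        and variant.gene in PCC_GENES
    )


def filter_variants(variants, stringency: str):
    """Return the variants that survive all filters at the given stringency."""
    return [v for v in variants if passes(v, stringency)]
```

Running the same call set through both settings, as was done in this study, shows which candidate variants depend on the permissive thresholds and may therefore deserve closer inspection before validation.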



Figure 1

Bioinformatics pipeline for analysis of exome sequencing in the clinical
genetic screening of pheochromocytoma. Steps, in order: exome sequencing;
removal of duplicate reads and quality trimming; mapping of reads to the
reference (GRCh37/hg19); variant calling; filtering for non-synonymous and
splice site variants; filtering for PCC susceptibility genes; annotation
against allele databases; selective validation with Sanger sequencing.

Sanger sequencing 

DNA was prepared from peripheral blood and tumour
cryosections using the DNeasy Blood and Tissue Kit (Qiagen). To
serve as a control and to verify variants
discovered by NGS, fragments corresponding to all exons
and intron-exon junctions of major susceptibility genes
(SDHB, SDHC, VHL, MAX and RET (exons 10, 11 and 13-16)), as
well as selected fragments in NF1 (exon 9), were amplified
by PCR and sequenced using automated Sanger sequencing
(Beckman Coulter, Takeley, UK). Primer sequences and PCR
conditions can be obtained on request.

Results 

Exome sequencing of three PCC tumour lesions generated
a read coverage of 1× (98-99%), 10× (94-96%) and 100×
(35-77%) for bases annotated by PCC susceptibility genes
(Table 2 and Fig. 2). A total of 30 and 19 variants were
identified with low and high variant-calling stringency
respectively (Supplementary Table 1, see section on
supplementary data given at the end of this article). In low
stringency, this corresponded to 16 unique variants. One
was assessed as probably pathogenic, one as possibly
pathogenic, four as benign and 11 as unknown. RET
Tyr791Phe and NF1 Arg304Ter were each found in one
patient and were assessed as possibly and probably
pathogenic respectively. Patient 1 had all variants classified as benign
or unknown, including one previously uncharacterized
variant in SDHC: Pro110Ser (Supplementary Figures 1, 2
and 3, see section on supplementary data given at the end
of this article). RET Tyr791Phe and SDHC Pro110Ser were
verified by Sanger sequencing in both blood and tumour
tissues. Compared with results from Sanger sequencing of
SDHB, SDHC, VHL, RET (exons 10-11 and 13-16) and MAX
as control, there were no false negatives generated by NGS
(Supplementary Table 1).
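
The coverage figures above can be read as the fraction of target bases covered at or above each depth. A minimal sketch of that calculation follows, assuming per-base depths for the susceptibility gene regions are already available as a list of integers; the function name and the input format are illustrative and are not taken from the software used in this study.

```python
from typing import Dict, Iterable, Sequence


def coverage_breadth(
    depths: Sequence[int],
    thresholds: Iterable[int] = (1, 10, 100),
) -> Dict[int, float]:
    """Fraction of positions covered at or above each depth threshold.

    `depths` holds one read depth per target base (hypothetical input,
    e.g. extracted from a per-base depth report for the regions of interest).
    """
    if not depths:
        raise ValueError("no positions supplied")
    total = len(depths)
    return {t: sum(d >= t for d in depths) / total for t in thresholds}
```

Computing the same summary per gene rather than across all targets would expose poorly captured loci that an overall figure can hide.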

Patient 1: SDHC variant of uncertain clinical significance 

A 61-year-old woman was investigated due to therapy-resistant
hypertension of unknown aetiology. The urine noradrenaline
level was elevated. The patient underwent
a laparoscopic left-sided adrenalectomy and the pathology
report described a benign PCC, 25 × 20 mm in size and with a
weight of 4.5 g. Immunohistochemistry demonstrated
expression of chromogranin A and a Ki67 index of 1%.
Exome sequencing revealed seven SNVs; one was classified
as benign and six as unknown. There was one missense
variant in SDHC located at position 477C>T, resulting in the
amino acid substitution Pro110Ser. This variant was not
found in the HGMD, dbSNP, COSMIC or LOVD databases,
nor could it be found in a PubMed search. In silico analysis
using PolyPhen-2 and SIFT estimated SDHC Pro110Ser as
benign (score 0.231) and tolerated (score 0.93) respectively.

Patient 2: RET variant of uncertain clinical significance

A 27-year-old woman was investigated post partum due to
therapy-resistant hypertension during the second and third
trimesters. The patient had elevated urine noradrenaline and
adrenaline levels. She underwent a laparoscopic right-sided
adrenalectomy and the pathology report described a
PCC, 50 × 50 mm in size with a weight of 54 g. Immunohistochemistry
showed strong staining of chromogranin A and
a Ki67 index of <0.5%. Exome sequencing revealed 13 SNVs;
three were classified as benign and nine as unknown. One
missense variant was assessed as possibly pathogenic, located
at position 2372A>T (rs77724903), resulting in the amino
acid substitution Tyr791Phe, in the proto-oncogene
tyrosine-protein kinase receptor (RET) gene (Fig. 3). The
pathogenicity of RET Tyr791Phe is disputed (22, 23, 24).

Figure 2

Detailed coverage at bases annotated for PCC susceptibility genes.

Patient 3: NF1 variant 

A 65-year-old woman with a two-decade history of
hypertension and newly diagnosed adenocarcinoma of
the breast was investigated due to abdominal discomfort.
Computed tomography of the abdomen showed a lesion in
the left adrenal gland and subsequent urine collection
revealed high levels of noradrenaline. The patient underwent
a left-sided adrenalectomy and the pathology
report described a cystic PCC, 60 × 50 mm in size and with
a weight of 59 g. The immunoreactivity of chromogranin A
was strong and the Ki67 index was <1%. Exome sequencing
revealed ten SNVs; nine were classified as unknown. One
variant was assessed as probably pathogenic: a
nonsense variant located at position 910C>T
(rs76015786), resulting in the premature stop codon
Arg304Ter, in the neurofibromin (NF1) gene. The phenotype
of Arg304Ter is described in related tumours and we
assessed the variant as probably pathogenic (25, 26, 27).
However, this variant could not be confirmed by Sanger
sequencing.

Discussion 

Genetic screening of PCC and PGL has been found to be
beneficial in practicing centres (28). Utilizing novel
sequencing techniques has the potential to decrease costs
and time consumption, thereby lowering the threshold
for inclusion.



The finding of the clinically relevant allele RET Tyr791Phe
clearly exemplified the potential of NGS as a diagnostic
tool, while SDHC Pro110Ser illustrated the complexity of
possibly pathogenic, but previously unknown, variants.
NF1 Arg304Ter displayed potential methodology conflicts;
however, conflicts in results generated by certified clinical
genetic laboratory testing using Sanger sequencing have
also been reported (29).

Price 

A direct cost comparison between whole exome sequencing
and traditional methods is complicated by the variability
with which genetic screening is currently performed. The
total cost for analysing the most frequently mutated genes
(SDHB, SDHD, VHL and RET) is estimated to be 3500 USD
(12, 30) and if screening all ten susceptibility genes, we
estimate the cost to be 10 000 USD. The use of genetic
screening algorithms may clearly reduce costs but can be 
time consuming and are designed for scenarios in which 
patient characteristics clearly indicate specific loci (31). The 
costs of exome enrichment and sequencing in this study 
were considerably lower than those of traditional screening, 
and as the techniques develop fast, further cost reductions 
are expected (Hayden EC, The $1000 genome: are we there 
yet?, 2012, NATURE NEWS BLOG). 

Performance 

Raw sequences generated by NGS require computational
processing, mapping reads to a reference sequence and
calling variants between the two. Results generated by NGS
should be confirmed with a principally different sequencing
chemistry. The bioinformatics process should deliver a
defined list of variants. Stochastic false positives occur at
relatively high frequencies but may be filtered given that the
position is covered by an adequate sequence depth (about
30-fold). False negatives are more insidious and may be
caused by incomplete enrichment, uneven sequencing
coverage or faulty bioinformatics processing (17). Additionally,
a high sequence depth allows NGS to detect alleles at
thresholds below that of Sanger sequencing. These
specifications predict built-in conflicts in which NGS may generate
probably pathogenic variants that cannot be validated by
Sanger sequencing (as in patient 3). Other validation methods
(e.g. pyrosequencing) may detect alleles at a lower frequency
but at a higher cost (32). A situation with multiple unknown
variants has been expected and was confirmed by this study
(1). Evaluating the significance of such 'genetic incidentalomas'
may be highly resource demanding and clearly
demonstrates the need to further expand and curate allele
databases such as dbSNP and LOVD.

Figure 3

Screenshot of sequences as displayed in CLC genomics 4.9. From above:
reference sequence, consensus sequence and mapped tumour reads (blue
colour, intact read pairs; green colour, broken forward read; red colour,
broken reverse read). Below: chromatograms of the corresponding
sequences generated by Sanger sequencing. Panels: Patient 1, SDHC
Pro110Ser; Patient 3, NF1 Arg304*.
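
The relationship between sequence depth and the ability to detect low-frequency alleles, discussed in this section, can be illustrated with a simple calculation. The sketch below assumes that reads sample the two alleles independently (a binomial model) and ignores sequencing error, mapping bias and capture bias, so it is a simplification for illustration and not a method used in this study. The function name and parameters are hypothetical.

```python
from math import comb


def prob_at_least(min_variant_reads: int, depth: int, allele_fraction: float) -> float:
    """Probability of observing at least `min_variant_reads` variant reads.

    Assumes each of the `depth` reads independently carries the variant
    allele with probability `allele_fraction` (binomial sampling).
    """
    if not 0.0 <= allele_fraction <= 1.0:
        raise ValueError("allele_fraction must lie between 0 and 1")
    if depth < 0:
        raise ValueError("depth must be non-negative")
    if min_variant_reads <= 0:
        return 1.0
    # Sum the probabilities of seeing fewer variant reads than required.
    upper = min(min_variant_reads, depth + 1)
    below = sum(
        comb(depth, k) * allele_fraction**k * (1.0 - allele_fraction) ** (depth - k)
        for k in range(upper)
    )
    return max(0.0, 1.0 - below)
```

Under this model, an allele present in a small fraction of the sampled DNA is far more likely to be supported by several reads at 100-fold depth than at 10-fold depth, which is consistent with the argument that deep NGS coverage can reveal variants that a Sanger chromatogram does not resolve.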

Time constraint in a clinical setting is also a challenge. 
A diagnostic test must have a throughput measurable in 
weeks. In theory, the NGS process can be tuned to deliver 
results within 1 week (33). With a pre-defined bioinfor- 
matics assay, the necessary computational analysis for our 
experiments had a throughput of < 24 h, including a total 
in-house hands-on time of <30 min. 



Exome vs targeted enrichment 

Sequencing of tumour tissue with complete exome coverage
differs from the current diagnostic procedure in which limited
loci in germline DNA are analysed. The theoretical potential is
to provide improved prognostic and/or predictive information
to individualize the care of the patient (34, 35).
Managing the surplus of genetic information that does not 
involve genes associated with the specific disease nor with its 
treatment is problematic (36). Ethical and financial frame- 
works regarding rights and responsibilities of patients and 
providers need to be implemented (16). While the concept of 
personalized medicine based on whole genome or exome 
coverage needs to mature, there are immediate benefits of 
NGS in clinical situations such as in the PCC and PGL 
patients. Examining the available sequencing apparatus and 
the upcoming pipeline, applications classified as medium 
capacity are closest to fulfilling the optimal specification of 
requirements for this situation: low costs, fast throughput, 
high accuracy and a capacity matching the size of loci 
conferring susceptibility to PCC and PGL (35, 37). 

Limitations of this study 

Exome enrichment resulted in a coverage of above ten
reads for more than 90% of targeted regions. However,
detailed coverage analysis (Fig. 2) revealed PCC loci
lacking 10× coverage (the VHL gene had 10× coverage at
only ~50% of bases). Use of exome enrichment prevents
analysis of structural variants (38), thus limiting the
comparison of NGS results with current standards (Multiplex
Ligation-dependent Probe Amplification).

As tumour tissue was sequenced without matched 
constitutional DNA, the bioinformatics process could not 
classify variants as somatic or constitutional. Therefore, 
future studies should include multiple cases with matched 
tumoral and normal tissues from patients having charac- 
terized pathogenic disease-causing variants. The method 
for target enrichment should be selected with regard to 
expected coverage at PCC and PGL disease causing loci. 

Sanger sequencing as a validation method for NGS results
has been replaced by other more sensitive methods (39); the
finding of NF1 Arg304Ter by NGS, but not by Sanger sequencing,
is an example of discordance between these two methods.

Conclusion 

We conclude that NGS may serve as a fast and cost-effective
method in the clinical genetic screening of patients
with PCC and PGL. In order to facilitate the introduction of
NGS as a diagnostic application, we identified process 
optimization, characterization of unknown variants and 
determination of additive effects of multiple variants as key 
issues to be addressed by future studies. 